Granular Box Regression Methods for Outlier Detection
Authors
Abstract
Granular computing (GrC) is an emerging computing paradigm of information processing. It concerns the processing of complex information entities called information granules, which arise in the process of data abstraction and derivation of knowledge from information. Granular computing is more of a theoretical perspective; it encourages an approach to data that recognizes and exploits the knowledge present in data at various levels of resolution or scale. Granular computing provides a rich variety of algorithms, including methods derived from interval mathematics, fuzzy sets, rough sets, and others. Within this framework, granular box regression was proposed recently. The core idea of granular box regression is to determine a fuzzy graph by embedding a given dataset into a predefined number of “boxes”. Granular box regression utilizes intervals, and one challenge is the detection of outliers. In this paper, we propose a borderline method and a residual method to detect outliers in granular box regression. We also apply these methods to artificial data as well as to real motor insurance data.
Similar resources
Outlier Detection using Granular Box Regression Methods
Granular computing (GrC) is an emerging computing paradigm of information processing. It concerns the processing of complex information entities called information granules, which arise in the process of data abstraction and derivation of knowledge from information. Granular computing is more of a theoretical perspective; it encourages an approach to data that recognizes and exploits the knowledge...
Outlier Detection Based on Granular Computing
As an emerging conceptual and computing paradigm of information processing, granular computing has received much attention recently. Many models and methods of granular computing have been proposed and studied. Among them was the granular computing model using information tables. In this paper, we shall demonstrate the application of this granular computing model for the study of a specific dat...
Outlier Detection by Boosting Regression Trees
A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...
Selection of Best Outlier Detection Method Using Regression Analysis
Outliers are unusual data values that are inconsistent with most of the records. Such non-representative records can seriously affect the model to be produced, so detecting outliers is a significant task in achieving higher accuracy. Several outlier detection methods are used in the literature for real as well as simulated data sets. The aim of this study is to compare the two outlier detection method ...
Outlier Detection Methods in Multivariate Regression Models
Outlier detection statistics based on two models, the case-deletion model and the mean-shift model, are developed in the context of a multivariate linear regression model. These are generalizations of the univariate Cook’s distance and other diagnostic statistics. Approximate distributions of the proposed statistics are also obtained to get suitable cutoff points for significance tests. In addi...